Adaptive swarm behavior acquisition by a neuro-fuzzy system and reinforcement learning algorithm
Abstract
Purpose – A neuro-fuzzy system combined with a reinforcement learning (RL) algorithm is presented for the acquisition of adaptive swarm behaviors. The basic idea is that every individual (agent) has the same internal model and the same learning procedure, and that adaptive behaviors are acquired solely from the reward or punishment given by the environment. The formation of the swarm is also designed by RL, e.g., a TD-error learning algorithm, and it may yield faster exploration than individual learning.

Design/methodology/approach – The internal model of each individual consists of two parts: a fuzzy net that classifies the input states, and an optimal-behavior learning network that adopts the reinforcement learning methodology known as the actor-critic method. The membership functions and fuzzy rules of the fuzzy net are formed adaptively online, according to the changes in the environment states observed during the agent's behavior trials. The weights of the connections between the fuzzy net and the action-value functions of the actor, which provides a stochastic policy for action selection, and of the critic, which evaluates state transitions, are modified by the temporal difference error (TD-error).

Findings – Simulation experiments with the proposed system were carried out on several goal-directed navigation problems. The results showed that swarms were formed successfully and that swarm learning found optimized routes faster than individual learning.

Originality/value – Two techniques, a fuzzy identification system and a reinforcement learning algorithm, are fused into the internal model of the individuals for swarm formation and adaptive behavior acquisition. The proposed model may be applied to multi-agent systems, swarm robotics, metaheuristic optimization, and so on.
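The TD-error update described in the abstract can be illustrated with a minimal single-agent sketch. This is not the authors' implementation: the fixed Gaussian membership functions, the softmax policy, the toy corridor task, and every name and parameter value below (`memberships`, `td_update`, `train`, the learning rates, the discount factor) are assumptions made for illustration. In particular, the paper forms its membership functions and fuzzy rules adaptively online and trains a swarm of agents, whereas this sketch keeps the rules fixed and trains one agent.

```python
import math
import random


def memberships(x, centers, width):
    """Normalized Gaussian membership degrees of a scalar state x.

    Assumption: centers and width are fixed, unlike the adaptive fuzzy net
    described in the abstract.
    """
    raw = [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]


def dot(w, phi):
    return sum(wi * pi for wi, pi in zip(w, phi))


def policy(actor_w, phi):
    """Stochastic policy: softmax over the actor's action preferences."""
    prefs = [dot(w, phi) for w in actor_w]
    top = max(prefs)  # subtract the maximum for numerical stability
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def td_update(critic_w, actor_w, phi, action, reward, phi_next, done,
              gamma=0.9, lr_critic=0.1, lr_actor=0.1):
    """Apply one TD-error update to the critic and actor weights in place."""
    v_next = 0.0 if done else dot(critic_w, phi_next)
    td_error = reward + gamma * v_next - dot(critic_w, phi)
    for i, p in enumerate(phi):
        critic_w[i] += lr_critic * td_error * p        # critic: state value
        actor_w[action][i] += lr_actor * td_error * p  # actor: preference
    return td_error


def train(episodes=200, goal=4, seed=0, max_steps=200):
    """Hypothetical toy task: a corridor with positions 0..goal.

    Actions: 0 = step left, 1 = step right. Reward 1.0 on reaching the goal.
    """
    rng = random.Random(seed)
    centers = [float(c) for c in range(goal + 1)]
    critic_w = [0.0] * len(centers)
    actor_w = [[0.0] * len(centers) for _ in range(2)]
    for _episode in range(episodes):
        pos = 0
        for _step in range(max_steps):
            phi = memberships(pos, centers, 0.5)
            action = rng.choices([0, 1], weights=policy(actor_w, phi))[0]
            nxt = min(goal, max(0, pos + (1 if action == 1 else -1)))
            done = nxt == goal
            reward = 1.0 if done else 0.0
            td_update(critic_w, actor_w, phi, action, reward,
                      memberships(nxt, centers, 0.5), done)
            pos = nxt
            if done:
                break
    return critic_w, actor_w
```

Calling `train()` returns the learned critic and actor weights; passing the actor weights and a membership vector to `policy` gives the action probabilities for that state. The same TD error drives both updates, which is the defining feature of the actor-critic scheme named in the abstract.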
Similar resources
Fraud Detection of Credit Cards Using Neuro-fuzzy Approach Based on TLBO and PSO Algorithms
The aim of this paper is to detect bank credit card fraud. With traditional classifiers, the large volume of data and the similarity among samples make separating legitimate from fraudulent behavior time-consuming and inaccurate. Therefore, this study uses the Adaptive Neuro-Fuzzy Inference System (ANFIS) to obtain a more efficient and accurate algorithm. By com...
Adaptive Neuro Fuzzy Sliding Mode Based Genetic Algorithm Control System to Control of a pH Neutralization Process
In this paper, an adaptive neuro fuzzy sliding mode based genetic algorithm (ANFSGA) control system is proposed for a pH neutralization system. In pH reactors, determination and control of pH is a common problem in chemical-based industrial processes due to the non-linearity observed in the titration curve. An ANFSGA control system is designed to overcome the complexity of precise control o...
Voting Algorithm Based on Adaptive Neuro Fuzzy Inference System for Fault Tolerant Systems
Some applications are critical and must be designed as fault-tolerant systems. A voting algorithm is usually one of the principal elements of a fault-tolerant system. Two kinds of voting algorithm are used in most applications: the majority voting algorithm and the weighted average algorithm. These algorithms have some problems. Majority voting faces the problem of threshold limits and voter of weight...
Mini/Micro-Grid Adaptive Voltage and Frequency Stability Enhancement Using Q-learning Mechanism
This paper develops an adaptive method for controlling the frequency and voltage of an islanded mini/micro grid (M/µG) using reinforcement learning. Reinforcement learning (RL) is a branch of machine learning and the main solution method for Markov decision processes (MDPs). Among the several solution methods of RL, the Q-learning method is used for solving RL in th...
Journal title:
- Int. J. Intelligent Computing and Cybernetics
Volume: 2
Publication date: 2009